虽然最近的基于NERF的生成模型实现了不同的3D感知图像的产生,但这些方法在生成包含用户指定特征的图像时具有限制。在本文中,我们提出了一种新颖的模型,称为条件生成神经辐射场(CG-NERF),其可以生成反映诸如图像或文本的额外输入条件的多视图图像。在保留给定输入条件的常见特征的同时,所提出的模型以精细的细节生成不同的图像。我们提出:1)一种小说统一的架构,它从各种形式和2)以各种形式和2)给出的姿势一致的分集损失,用于在保持视图的一致性的同时产生姿势 - 一致的分集损失。实验结果表明,与现有的基于NERF的生成模型相比,该方法对各种情况类型的图像质量保持一致的图像质量,并实现了卓越的保真度和多样性。
translated by 谷歌翻译
For ensuring vehicle safety, the impact performance of wheels during wheel development must be ensured through a wheel impact test. However, manufacturing and testing a real wheel requires a significant time and money because developing an optimal wheel design requires numerous iterative processes to modify the wheel design and verify the safety performance. Accordingly, wheel impact tests have been replaced by computer simulations such as finite element analysis (FEA); however, it still incurs high computational costs for modeling and analysis, and requires FEA experts. In this study, we present an aluminum road wheel impact performance prediction model based on deep learning that replaces computationally expensive and time-consuming 3D FEA. For this purpose, 2D disk-view wheel image data, 3D wheel voxel data, and barrier mass values used for the wheel impact test were utilized as the inputs to predict the magnitude of the maximum von Mises stress, corresponding location, and the stress distribution of the 2D disk-view. The input data were first compressed into a latent space with a 3D convolutional variational autoencoder (cVAE) and 2D convolutional autoencoder (cAE). Subsequently, the fully connected layers were used to predict the impact performance, and a decoder was used to predict the stress distribution heatmap of the 2D disk-view. The proposed model can replace the impact test in the early wheel-development stage by predicting the impact performance in real-time and can be used without domain knowledge. The time required for the wheel development process can be reduced by using this mechanism.
translated by 谷歌翻译
外围插入的中央导管(PICC)由于其长期的血管内渗透感具有低感染率,因此已被广泛用作代表性的中央静脉线(CVC)之一。但是,PICC的尖端错位频率很高,增加了刺穿,栓塞和心律不齐等并发症的风险。为了自动,精确地检测到它,使用最新的深度学习(DL)技术进行了各种尝试。但是,即使采用了这些方法,实际上仍然很难确定尖端位置,因为多个片段现象(MFP)发生在预测和提取PICC线之前预测尖端之前所需的PICC线的过程。这项研究旨在开发一种通常应用于现有模型的系统,并通过删除模型输出的MF来更准确地恢复PICC线路,从而精确地定位了检测其处置的实际尖端位置。为此,我们提出了一个基于多阶段DL的框架后处理,以后处理现有技术的PICC线提取结果。根据是否将MFCN应用于五个常规模型,将每个均方根误差(RMSE)和MFP发病率比较性能。在内部验证中,当将MFCN应用于现有单个模型时,MFP平均提高了45%。 RMSE从平均26.85mm(17.16至35.80mm)到9.72mm(9.37至10.98mm)的平均增长了63%以上。在外部验证中,当应用MFCN时,MFP的发病率平均下降32%,RMSE平均下降了65 \%。因此,通过应用提出的MFCN,我们观察到与现有模型相比,PICC尖端位置的显着/一致检测性能提高。
translated by 谷歌翻译
GPT-3显示了培训的大规模语言模型(LMS)的卓越情调学习能力,培训数十亿规模数据。在这里,我们解决了GPT-3纸张报告的一些剩余问题,例如非英语LM,不同大小模型的性能,以及最近引入的迅速优化对上下文学习的效果。为实现这一目标,我们介绍了HyperClova,一个韩国VPT-3的韩国变体训练在一个以韩国为中心的560b标准的令牌。通过我们的韩国特定标记化,HyperClova与我们的培训配置增强,显示了韩国各种下游任务的最先进的上下游零射击和几秒钟学习表演。此外,我们展示了基于及时的学习的性能优势,并演示如何集成到迅速的工程管道中。然后,我们讨论了通过引入Hyperclova Studio,互动提示工程界面向ML的非专家提供AI原型设计能力来实现No Code AI范例的可能性。最后,我们展示了我们具有三个成功的内部应用程序的方法的潜力。
translated by 谷歌翻译
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
translated by 谷歌翻译
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortion and image content variation, which complicate the distortion patterns crossing different scales and aggravate the difficulty of the regression problem for BIQA. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies to make the regression model produce better performance. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and better optimize the regression issue to align with the law of human learning process from easy to hard. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets, and the experimental results indicate that the performance of PMT-IQA is superior to the comparison approaches, and both MS and PMT modules improve the model's performance.
translated by 谷歌翻译
The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
translated by 谷歌翻译
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from the trade-off between accuracy and efficiency. Recent advances in neural operators, a kind of mesh-independent neural-network-based PDE solvers, have suggested the dawn of overcoming this challenge. In this emerging direction, Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments implemented on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest the potential of KoopmanLab to be considered in diverse applications of partial differential equations.
translated by 谷歌翻译
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
translated by 谷歌翻译
Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to the semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
translated by 谷歌翻译